debugger/symdb: add upload metadata fields to upload event message #11329

Open: andreimatei wants to merge 1 commit into master from andrei/symdb-upload-fields

Contributor

andreimatei commented on May 8, 2026

Add the following fields to the SymDB upload event message that
accompanies each multipart upload (camelCase, matching the rest of
the EvP event schema):

  • "version" (top-level): the service version
  • "language" (top-level): "java"
  • "uploadId" (top-level): a UUID generated once per SymbolSink
    instance, shared by all batches uploaded by the sink
  • "batchNum" (top-level): 1-indexed counter incremented per upload
  • "final" (top-level): always false; the Java tracer continuously
    uploads new code as classes get loaded, so there is no defined
    end-of-upload point
  • "attachmentSize" (top-level): size in bytes of the (compressed)
    attachment payload
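
For illustration, the six event fields listed above could be rendered as in the following minimal sketch. The class `UploadEventSketch` and its `eventJson` helper are hypothetical names, not taken from the tracer; the real event presumably carries other fields that are not shown here, and production code would use a JSON library rather than string concatenation.

```java
import java.util.UUID;

// Hypothetical sketch: renders only the fields named in this PR description.
final class UploadEventSketch {

  static String eventJson(String version, UUID uploadId, int batchNum, long attachmentSize) {
    return "{"
        + "\"version\":\"" + escape(version) + "\","
        + "\"language\":\"java\","
        + "\"uploadId\":\"" + uploadId + "\","
        + "\"batchNum\":" + batchNum + ","          // 1-indexed
        + "\"final\":false,"                        // always false for the Java tracer
        + "\"attachmentSize\":" + attachmentSize    // size of the compressed payload, in bytes
        + "}";
  }

  // Minimal escaping for backslashes and double quotes only (assumption:
  // version strings contain no control characters).
  static String escape(String s) {
    return s.replace("\\", "\\\\").replace("\"", "\\\"");
  }

  public static void main(String[] args) {
    System.out.println(eventJson("1.2.3", UUID.randomUUID(), 1, 2048L));
  }
}
```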

Also add the same metadata to the gzipped attachment body via the
ServiceVersion wrapper (snake_case to match the rest of the
attachment scope schema):

  • "upload_id"
  • "batch_num"
  • "final"

uploadId/batchNum are computed once per batch in serializeAndUpload
so both the attachment and the event JSON carry the same values.
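
A rough sketch of that per-batch flow, under stated assumptions: `SinkSketch`, `Batch`, `prepareBatch`, and the `scopes` key are hypothetical stand-ins, not the actual SymbolSink or ServiceVersion code. It only shows the idea that the batch number is taken once, then written into both the snake_case attachment and the camelCase event, with `attachmentSize` measured on the compressed bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in for the sink; not the tracer's real class.
final class SinkSketch {
  private final UUID uploadId = UUID.randomUUID();               // once per sink instance
  private final AtomicInteger batchCounter = new AtomicInteger();

  static final class Batch {
    final byte[] gzippedAttachment;
    final String eventJson;

    Batch(byte[] gzippedAttachment, String eventJson) {
      this.gzippedAttachment = gzippedAttachment;
      this.eventJson = eventJson;
    }
  }

  Batch prepareBatch(String scopesJson) {
    // Taken once per batch so the attachment and the event agree.
    int batchNum = batchCounter.incrementAndGet();               // 1-indexed
    String attachment = "{\"upload_id\":\"" + uploadId + "\","
        + "\"batch_num\":" + batchNum + ","
        + "\"final\":false,"
        + "\"scopes\":" + scopesJson + "}";                      // "scopes" key is assumed
    byte[] gz = gzip(attachment.getBytes(StandardCharsets.UTF_8));
    String event = "{\"uploadId\":\"" + uploadId + "\","
        + "\"batchNum\":" + batchNum + ","
        + "\"final\":false,"
        + "\"attachmentSize\":" + gz.length + "}";               // compressed size
    return new Batch(gz, event);
  }

  static byte[] gzip(byte[] raw) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
      gz.write(raw);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }

  public static void main(String[] args) {
    SinkSketch sink = new SinkSketch();
    System.out.println(sink.prepareBatch("[]").eventJson);
    System.out.println(sink.prepareBatch("[]").eventJson);
  }
}
```

Running `main` prints two events that share one `uploadId` and differ in `batchNum`.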

Some of these fields are new, to be used by the backend in the future.
Others duplicate info that was already included in the attachment; by
duplicating some metadata out of the SymDB attachment body into the EvP
event body, the backend can populate per-attachment bookkeeping without
downloading the attachment.

andreimatei requested a review from a team as a code owner on May 8, 2026 17:40
andreimatei requested review from jpbempel and removed the request for a team on May 8, 2026 17:40
Contributor

github-actions Bot commented May 8, 2026

Hi! 👋 Thanks for your pull request! 🎉

To help us review it, please make sure to:

  • Add at least one type, and one component or instrumentation label to the pull request

If you need help, please check our contributing guidelines.

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a82745bb27

andreimatei force-pushed the andrei/symdb-upload-fields branch from a82745b to 37aee89 on May 8, 2026 17:46

pr-commenter Bot commented May 8, 2026

Debugger benchmarks

Parameters

| Parameter | Baseline | Candidate |
| --- | --- | --- |
| baseline_or_candidate | baseline | candidate |
| ci_job_date | 1778281491 | 1778281837 |
| end_time | 2026-05-08T23:06:17 | 2026-05-08T23:12:03 |
| git_branch | master | andrei/symdb-upload-fields |
| git_commit_sha | 7e99f00 | 875f094fb683343cdb2db8c3e987c8588eaee7a3 |
| start_time | 2026-05-08T23:04:52 | 2026-05-08T23:10:38 |

Matching parameters:

| Parameter | Baseline | Candidate |
| --- | --- | --- |
| ci_job_id | 1669639392 | 1669639392 |
| ci_pipeline_id | 112290209 | 112290209 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| git_commit_date | 1778279590 | 1778279590 |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 9 metrics, 6 unstable metrics.

Unchanged results:

| scenario | Δ mean agg_http_req_duration_min | Δ mean agg_http_req_duration_p50 | Δ mean agg_http_req_duration_p75 | Δ mean agg_http_req_duration_p99 | Δ mean throughput |
| --- | --- | --- | --- | --- | --- |
| scenario:noprobe | unstable, [-18.452µs; +47.427µs] or [-6.415%; +16.487%] | unstable, [-31.447µs; +63.204µs] or [-9.474%; +19.042%] | unstable, [-37.476µs; +72.868µs] or [-10.817%; +21.034%] | unstable, [-288.513µs; +188.886µs] or [-21.837%; +14.297%] | same |
| scenario:basic | same | same | same | unstable, [-436.358µs; -168.901µs] or [-33.428%; -12.939%] | unstable, [-136.485op/s; +136.485op/s] or [-5.459%; +5.459%] |
| scenario:loop | same | same | same | same | same |
Request duration reports

[Gantt chart: request duration [CI 0.99], baseline vs. candidate, for the noprobe, basic, and loop scenarios]
  • baseline results

| Scenario | Request median duration [CI 0.99] |
| --- | --- |
| noprobe | 331.92 µs [303.69 µs, 360.15 µs] |
| basic | 294.041 µs [288.079 µs, 300.003 µs] |
| loop | 8.987 ms [8.982 ms, 8.993 ms] |

  • candidate results

| Scenario | Request median duration [CI 0.99] |
| --- | --- |
| noprobe | 347.799 µs [290.782 µs, 404.816 µs] |
| basic | 296.403 µs [288.639 µs, 304.167 µs] |
| loop | 8.988 ms [8.982 ms, 8.993 ms] |

andreimatei force-pushed the andrei/symdb-upload-fields branch from 37aee89 to 875f094 on May 8, 2026 22:56
Member

jpbempel left a comment

LGTM with a few comments

version,
"JAVA",
scopesToSerialize,
uploadId.toString(),
Member

This should be a String field then, since the uploadId is generated when the sink is created.

Contributor Author

I'm not sure I'm parsing this comment properly, but if you're suggesting turning uploadId into a string, why? It seems better to me to do the string conversion as late as possible and keep its UUID character wherever possible (in fact, I'd ideally never convert it to a string, but ServiceVersion is marshaled to JSON directly, so Claude says its uploadId field cannot simply be declared as a UUID).

Member

Basically, the uploadId created with the sink is constant for all the uploads you will perform, so why convert it to a string each time?

Contributor Author

ok, done
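
One way to read the resolution of this thread, sketched with hypothetical names (`CachedUploadId`, `nextBatchHeader`; not the actual change in the PR): the UUID is converted to a String once when the object is constructed, and every batch reuses that String.

```java
import java.util.UUID;

// Hypothetical sketch of "convert once at sink creation".
final class CachedUploadId {
  // String form computed a single time; all batches share it.
  private final String uploadId = UUID.randomUUID().toString();
  private int batchNum; // plain int: this sketch is not thread-safe

  String uploadId() {
    return uploadId;
  }

  // Builds a small snake_case header for the next batch, reusing the cached String.
  String nextBatchHeader() {
    batchNum++;
    return "{\"upload_id\":\"" + uploadId + "\",\"batch_num\":" + batchNum + "}";
  }

  public static void main(String[] args) {
    CachedUploadId sink = new CachedUploadId();
    System.out.println(sink.nextBatchHeader());
    System.out.println(sink.nextBatchHeader());
  }
}
```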

Member

Can you add uploadId to these debug logs, so we have the value for correlation with what we see on the backend side?

Contributor Author

done
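
A debug line carrying the upload ID might look like the sketch below. This is an assumption for illustration only: the class `UploadDebugLog`, the message wording, and the use of `java.util.logging` (chosen to keep the example stdlib-only) are not taken from the PR, and the tracer's actual logger and message format are not shown on this page.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch of a debug log line that includes uploadId for backend correlation.
final class UploadDebugLog {
  private static final Logger LOG = Logger.getLogger(UploadDebugLog.class.getName());

  // Message format is invented for this example.
  static String message(String uploadId, int batchNum, int attachmentSize) {
    return String.format(
        "Uploading SymDB batch: uploadId=%s batchNum=%d attachmentSize=%d",
        uploadId, batchNum, attachmentSize);
  }

  static void logUpload(String uploadId, int batchNum, int attachmentSize) {
    // Only build the message when debug-level (FINE) logging is enabled.
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine(message(uploadId, batchNum, attachmentSize));
    }
  }

  public static void main(String[] args) {
    System.out.println(message("00000000-0000-0000-0000-000000000001", 1, 2048));
  }
}
```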

andreimatei force-pushed the andrei/symdb-upload-fields branch from 875f094 to a2a6899 on May 11, 2026 18:52
andreimatei added the "type: enhancement" (Enhancements and improvements) and "comp: debugger" (Dynamic Instrumentation) labels on May 11, 2026
andreimatei force-pushed the andrei/symdb-upload-fields branch from a2a6899 to 70ee312 on May 11, 2026 19:06
andreimatei force-pushed the andrei/symdb-upload-fields branch from 70ee312 to 8d640b0 on May 11, 2026 21:42
andreimatei enabled auto-merge on May 11, 2026 21:43

Labels

comp: debugger (Dynamic Instrumentation), type: enhancement (Enhancements and improvements)

2 participants